ABSTRACT



An embodiment of the present invention locates facial 
features in an image by bandpass filtering the image and then 
performing morphological operations followed by a thresh- 
olding operation. This initial processing identifies candidate 
areas where facial features may be located. The candidate 
areas are evaluated by classifiers to determine if a facial 
feature, such as an eye or mouth, has been located. 
METHOD FOR LOCATING A SUBJECT'S LIPS IN A FACIAL IMAGE

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to image processing, more specifically, locating features in an image.

2. Description of the Related Art

Locating facial features in an image is important in security applications for automatically identifying individuals, in speech recognition for improving recognition rates, and in low bandwidth video telecommunications. In lower bandwidth video communications, an entire image is sent or updated every 4 or 5 frames while video information relating to a speaker's mouth is sent or updated at the full frame rate. In another low bandwidth video telecommunications application, an entire image is transmitted every 4 or 5 frames and the speaker's utterances are used as inputs to a model. The model's output is used to modify the appearance of a speaker's mouth on the receiving video terminal between frame updates. All of these applications would benefit from a more accurate technique to locate facial features, such as a mouth, in an image.

In the past, facial features have been located by detecting a cornea reflection or by using templates or deformable templates. Templates were used by moving the template over an image and defining an energy function that is minimized when the template closely matches a facial feature. These techniques did not perform well in a natural environment. A natural environment is one in which the lighting varies and the position of an individual's face with respect to the camera varies.

SUMMARY OF THE INVENTION

The present invention provides an apparatus and method for locating facial features in an image. The present invention is tolerant of varying lighting conditions and varying camera positions. An embodiment of the present invention locates facial features in an image by bandpass filtering the image and then performing morphological operations followed by a thresholding operation. This initial processing identifies candidate areas where facial features may be located. The candidate areas are evaluated by classifiers to determine if a facial feature, such as an eye or mouth, has been located.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 illustrates a functional block diagram of an embodiment of the present invention;

FIG. 2 illustrates an image stored in a memory;

FIG. 3 illustrates convolving a shape over a plurality of pixels;

FIG. 4 illustrates a histogram of pixel intensity;

FIG. 5 illustrates a histogram of connected components of pixels;

FIG. 6 illustrates connected components in a thresholded image;

FIG. 7 illustrates the image of FIG. 2 after bandpass filtering, morphological processing and thresholding;

FIG. 8 illustrates candidate eye and mouth locations;

FIG. 9 illustrates an eye and mouth combination having the best score;

FIGS. 10 and 11 illustrate stored profiles of lips; and

FIG. 12 illustrates six points used to identify the locations of the mouth and lips.

DETAILED DESCRIPTION

FIG. 1 illustrates a functional block diagram of an apparatus to locate facial features in an image; however, the invention may be used to locate any type of feature in an image. Camera 10 provides a pixel representation of an image to memory 12. Digital signal processing device (DSP) 14 processes the image stored in memory 12 to determine the location of facial features. DSP 14 uses memory 16 for program storage and as a scratch pad memory. The pixel data provided by camera 10 may be obtained directly from camera 10 if the camera provides a digital output, or if camera 10 provides an analog output, the pixel data may be obtained by passing the analog data from camera 10 through an analog-to-digital converter. The pixel representation of the image may be a gray scale image or a color image. Memory 12 may be constructed using any of the widely available random access memories. DSP 14 may be implemented using a digital signal processing chip provided by a manufacturer such as Texas Instruments, or it may be implemented using a microprocessor or using a combination of a microprocessor and a co-processor chip. Memory 16 may be constructed using a combination of random access memory and read-only memory.

FIG. 2 illustrates an image captured by camera 10. The image is stored in memory 12 in the form of pixels. In the case of a gray scale image, for example, each pixel may have an intensity value between 0 and 255. A high intensity indicates a bright pixel, and a low intensity indicates a dull or dark pixel. The image of FIG. 2 may have any scale; however, this embodiment of the invention uses an image with a scale of 240 pixels in the vertical direction and 360 pixels in the horizontal direction.

In order to locate facial features, the image is passed through a bandpass filter by convolving the image with a rectangular shape. As the image is convolved with the rectangular shape, the original image is retained in memory as a new image is constructed. FIG. 3 illustrates how an image is convolved with a rectangular shape. Rectangular shape 40 is scanned over the entire image pixel by pixel. For example, rectangular shape 40 is moved one pixel at a time in the horizontal direction until it reaches the end of the image, and then it is moved down one pixel in the vertical direction and again scanned across the image one pixel at a time in the horizontal direction. At each position, the intensities of pixels 42 contained within rectangle 40 are summed and averaged. This average is used to construct the image resulting from convolution. The intensity value of the pixel in the new image corresponding to the pixel at or near the center of the rectangle in the original image is set to the average. The pixel intensities of the original image are not changed as the new or resulting image is formed.

As mentioned earlier, the image of FIG. 2 is bandpass filtered using a rectangular shape that is convolved with the image. This bandpass filtering is carried out in two steps. The first step involves lowpass filtering the image by convolving the image with a small rectangular shape having a vertical dimension of two pixels and a horizontal dimension of one pixel. This process is carried out on the image of FIG. 2 to produce a lowpass filtered image. A copy of the lowpass filtered image is stored in memory 16 while DSP 14 performs another convolution operation on the lowpass filtered image. DSP 14 highpass filters the lowpass filtered image by convolving the lowpass filtered image with a
rectangular shape having a vertical dimension of 25 pixels and a horizontal dimension of 5 pixels. The resulting highpass filtered image is subtracted from the lowpass filtered image that was stored in memory 16 to produce a bandpass filtered image which is stored in memory 16. The subtraction is carried out by subtracting the values of pixels having corresponding positions in the images.

It should be noted that by changing the dimensions of the rectangular shape convolved with the image, the filtering characteristics in the vertical and horizontal directions are changed. For example, if the rectangle has a large vertical dimension, it tends to act as a lowpass filter in the vertical direction, and if it has a small vertical dimension, it filters out less of the high frequencies. Likewise, if the rectangular element has a large horizontal dimension, it tends to act as a lowpass filter in the horizontal direction. It should also be noted that the filtering operations may be carried out using Fourier transforms rather than convolution operations.

After performing the bandpass filtering operation on the image of FIG. 2, a morphological operation is performed to emphasize areas of the image that may contain facial features of interest. A morphological operation involves convolving an image with a rectangle or a shape similar to the shape of the feature to be emphasized. The rectangular shape used to convolve with the image has a vertical dimension of two pixels and a horizontal dimension of six pixels. As with the other convolution operations, the pixels within the rectangular area are summed and averaged. The pixel in the resulting image that corresponds to the pixel at or near the center of the rectangle is given an intensity equal to the average value. After the morphological operation is performed, the morphologically processed image is analyzed to determine a threshold.

FIG. 4 illustrates a histogram of a morphologically processed image where intensity is illustrated on a horizontal scale and the number of pixels is shown on the vertical scale. This histogram is converted to a histogram which illustrates a threshold value on the horizontal axis and the number of "connected components" on the vertical axis. FIG. 5 illustrates a connected component histogram.

The term "connected component" refers to a group of consecutive white pixels in a binary image, where the binary image is obtained by passing the morphologically processed image through a threshold process. The thresholding process involves setting a pixel to zero or dark if its intensity is below the threshold, and setting the pixel to 1 or white if the pixel's intensity is above or equal to the threshold. FIG. 6 is a portion of a binary image illustrating a connected component. A connected component of pixels may include more than a single row. For example, pixels 84 and 86 constitute a connected component because they form a group of consecutive white pixels. Pixels 82 constitute a connected component because there are two consecutive white pixels. It should also be noted that pixel 80 constitutes a connected component because it is a connected component comprising one white pixel.

A histogram of connected components is developed by first setting the threshold to one and then counting the number of connected components in the resulting binary image. The next point in the histogram is found by setting the threshold to two and counting the number of connected components in the resulting binary image. This process is continued for each possible intensity value, in this example 255 different values, until a completed histogram is developed.

A threshold is selected by making an analysis of the resulting histogram. In reference to FIG. 5, it should be noted that if a threshold value is set particularly low, there will be a large number of white pixels in the binary image. If a large number of pixels are white, there will be fewer white pixels standing alone, that is, it is more likely that pixels will be in connected components without black pixels breaking up the connected component. As a result, a low threshold produces a low number of connected components. Conversely, if the threshold is set high, a large number of pixels will be set to black rather than white. If there are a large number of black pixels, connected components of white pixels are more likely to be broken up by the black pixels, and as a result, a large number of connected components are formed. It is desirable to select a threshold that produces a reasonable number of connected components that may later be identified as a facial feature. If too large a number of connected components are formed, it may be difficult to eliminate false candidates for facial features. On the other hand, if the threshold is set too low, too many connected components merge into large connected components and thereby smear or eliminate desired facial features. The threshold is determined during a training procedure, to be described below, using representative images.

After the threshold is selected, the morphologically processed image is thresholded by comparing each pixel with the threshold. If the pixel's intensity is below the threshold, the pixel is set to 0 or black, and if the pixel's intensity is greater than or equal to the threshold, the pixel is set to 1 or white. This process of thresholding the image results in a binary image.

After the binary image has been formed, the connected components are examined in an attempt to identify eye candidates. Each connected component is measured to determine the connected component's height, width and aspect ratio (width/height). Each of these three parameters is compared to an ideal value, and the difference between the parameter and the ideal value is multiplied by a weight and then summed to form a score for each connected component. Equation 1 illustrates a linear classifier or a scoring process where width difference $W$ is multiplied by weight $w_{x1}$, and then added to the product of height difference $H$ multiplied by weight $w_{h1}$, which is summed with the product of aspect ratio difference $R$ multiplied by aspect ratio weight $w_{r1}$ to form eye score $S_e$. The weights and ideal values are determined during a training procedure using representative images.

$$W w_{x1} + H w_{h1} + R w_{r1} = S_e \qquad \text{(Equation 1)}$$

This process is carried out for each connected component in the binary image. The connected components having a score within a desirable range, as determined during training, are identified as eye candidates in the original image. FIG. 7 illustrates the connected components that have been identified as eye candidates superimposed on the binary image.

The above referenced weights, thresholds, ideal values and scores are obtained by a training process. The training process is carried out using a set of training images where the locations of the features of interest are known. The set of training images should be similar in size and representative of the images to be processed. For example, the training set might include 20 images having an image of a person's head where each person is facing in a forward direction toward a camera and where each of the images is a gray scale image. It is advisable to use a training set on the order of at least
20 images. Each of the training images is processed in the same fashion as described above. The weights and thresholds are then adjusted to maximize the ability to locate features while minimizing errors. For example, a weight is calculated by repeating the above procedure for several different weight values. At the end of each procedure the results are checked. The weight value that produced the best results is selected. This process is repeated for all weights, thresholds and scores, several times, until a satisfactory result is achieved. Generally speaking, the training process is well known in the art as a method for training linear classifiers. Training linear classifiers is discussed in reference "Pattern Classification and Scene Analysis" by Duda and Hart, pages 130-188, John Wiley and Sons, Inc., 1973, New York, N.Y. It should be noted that other classifiers, such as neural networks, may be used to identify or classify the connected components.

After eye candidates have been identified, mouth candidates are identified. The mouth candidates are identified in a fashion similar to that which was used with regard to eye candidates. The height, width and aspect ratio of each of the connected components are measured and compared to an ideal value. A score is produced for each of the connected components by taking a sum of the weighted width difference, weighted height difference and weighted aspect ratio difference. Equation 2 illustrates this scoring process. Weight $w_{x2}$ is used to weight the width difference of the connected component, weight $w_{h2}$ is used to weight the height difference of the connected component and weight $w_{r2}$ is used to weight the aspect ratio difference of the connected component.

$$W w_{x2} + H w_{h2} + R w_{r2} = S_m \qquad \text{(Equation 2)}$$

The connected components having a score within a desirable range are identified as mouth candidates. FIG. 8 illustrates the image of FIG. 2 with the eye position candidates indicated as circles and mouth position candidates indicated as lines. As described above, the weights, ideal values and desirable score range are determined by the training process.

After eye and mouth candidates have been identified, a search is made for eye pair candidates. Eye pair candidates are identified by examining all the previously identified eye candidates. All possible combinations of the eye candidates, taken two at a time, are examined to produce eye pair candidates. Each pair is examined by measuring the distance between the two eye candidates composing the eye pair under examination, by measuring the orientation of the pair of eye candidates, and by forming a sum of the scores previously calculated for each of the eye candidates composing the eye pair under examination. Equation 3 illustrates how a score is developed for an eye pair. Distance $D$ between the eye candidates is multiplied by weight $w_{d1}$, orientation value $O$ is multiplied by weight $w_o$, and the sum of previously calculated scores $S_{en}$ and $S_{em}$ is multiplied by weight $w_f$. These products are summed to form a score for the eye pair candidate. The orientation value in Equation 3 indicates how close a line connecting the two eye candidates composing the eye pair is to the horizontal.

$$D w_{d1} + O w_o + (S_{en} + S_{em}) w_f = S_{ep} \qquad \text{(Equation 3)}$$

The eye pairs having scores within the desirable range are identified as eye pair candidates. As described above, the weights, ideal values and desirable score range are determined by the training process.

The eye pair candidates are then used in conjunction with the previously identified mouth candidates to identify the eye and mouth positions in the image. Each combination of eye pair candidate and mouth candidate is examined. The examination involves determining the distance separating the eyes of the eye pair candidate, and calculating a mouth/eye ratio. The ratio is calculated by dividing the distance between the mouth candidate and a line interconnecting the eye candidates, by the distance between the eye candidates. The distance separating the eyes and the mouth/eye ratio are each multiplied by a weight and then summed to form a score. Equation 4 illustrates this process. Distance $D$ refers to the separation between the eyes of the eye pair candidate. Distance $D$ is multiplied by weight $w_{d2}$. The product $D w_{d2}$ is added to the product of ratio $D_{me}$ and weight $w_{me}$. Ratio $D_{me}$ is the vertical distance between a line, which interconnects the eyes of the eye pair candidate, and the mouth candidate, divided by distance $D$. The sum of these products forms a score for each of the eye pair-mouth combinations under examination.

$$D w_{d2} + D_{me} w_{me} = S_{epm} \qquad \text{(Equation 4)}$$

The scores for each combination are compared with an ideal score. The combination closest to the ideal score is identified as the combination showing the eye and mouth positions. FIG. 9 illustrates the original image of FIG. 2 with the eye and mouth positions identified. As described above, the weights, ideal values and desirable score range are determined by the training process.

It should be noted that if a priori knowledge exists regarding the person in the image being identified, the weights and score ranges can be specialized to identify a particular person's features. It should also be noted that if a position of a face in the image is known a priori, it is possible to eliminate many eye and mouth candidates simply based on their location within the image. If the image is part of a video, it is possible to eliminate some of the eye and mouth candidates by tracking head position or comparing two or more frames.

It is possible to provide low bandwidth telecommunications by transmitting the entire video picture at a relatively low frame rate while transmitting the portion of the image surrounding the mouth at a full frame rate. This requires identifying the position of the mouth with regard to the rest of the image, and then transmitting that portion of the image at full frame rate while transmitting the rest of the image at a lower frame rate such as every 5th frame.

In another low bandwidth video telecommunication application, it is desirable to identify lip position with as much accuracy as possible so that a morphing procedure may be used to modify the lips in the image presented to the party receiving the video image. In this way, a complete video image is transmitted every 4 or 5 frames while mouth or lip position is updated using a morphing procedure so that the image seen by the receiving party seems to be at the full frame rate. The morphing is carried out by carefully identifying the lip position on a transmitted frame, and then at the receiving end of the video transmission, modifying the lip positions based on the speaker's utterance. One of the well known models for predicting lip motion based on utterances is used to predict a mouth or lip position. Morphing is well known in the art and discussed in references such as "Digital Image Warping", G. Wolberg, IEEE Computer Society Press, 1990, Los Alamitos, Calif.

As discussed with regard to morphing, in some applications it is not only necessary to know the position of the mouth, but it is also desirable to know the position of the lips. An embodiment of the present invention also provides a method and apparatus for finding the position of a person's
lips in an image. The lips are located in a manner similar to
that which was used to locate the eyes and mouth. Initially,
the location of the mouth is determined as described
above. The portion of the image relating to the mouth is
processed in order to determine the position of the lips and 
a more exact outline of the mouth. In order to minimize 
computational overhead, only the portion of the image 
associated with the mouth is processed in order to locate the 
lips. For example, only the portion of the image containing 
the mouth and a border of 5 or 10 pixels surrounding that 
area is used when determining the exact location of the lips. 
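
As a minimal sketch of the cropping step just described, the following assumes the image is held as a list of rows of intensities and that the mouth's bounding rows and columns are already known. The function name `crop_with_border` and its parameters are hypothetical, and clipping at the image bounds is an assumption not stated in the text.

```python
def crop_with_border(image, top, left, bottom, right, border=5):
    """Return the sub-image covering rows top..bottom and columns left..right
    (inclusive), grown by `border` pixels on every side.

    Clipping the grown rectangle to the image bounds is an assumption.
    """
    height, width = len(image), len(image[0])
    r0 = max(0, top - border)
    r1 = min(height - 1, bottom + border)
    c0 = max(0, left - border)
    c1 = min(width - 1, right + border)
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]
```

Only this smaller region would then be passed to the filtering steps that follow, which is where the saving in computation comes from.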

The portion of the original image containing the mouth is
bandpass filtered as described above. The image is first 
lowpass filtered by convolving a rectangular shape with the 
image. The rectangular shape may have dimensions such as 
a vertical dimension of two pixels and a horizontal dimen- 
sion of one pixel. A copy of the lowpass filtered image is 
then stored in memory 16 and DSP 14 performs another 
convolution operation on the lowpass filtered image. This 
second convolution highpass filters the lowpass filtered 
image by convolving the lowpass filtered image with a 
rectangular shape having dimensions such as a vertical 
dimension of 25 pixels and a horizontal dimension of 5 
pixels. The resulting highpass filtered image is subtracted 
from the lowpass filtered image that was stored in memory 
16 to produce a bandpass filtered image which is also stored 
in memory 16. The subtraction is carried out by subtracting 
the intensity values of pixels having corresponding positions 
in the images. 
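
The two-step filtering described above can be sketched as follows, assuming the image is a list of rows of intensities. The names `box_average` and `bandpass` are hypothetical, and ignoring pixels that fall outside the image at the borders is an assumption; the text does not say how borders are handled.

```python
def box_average(image, box_h, box_w):
    """Replace each pixel by the average of a box_h x box_w rectangle
    positioned with that pixel at or near its center.

    Pixels of the rectangle that fall outside the image are ignored
    (border handling is an assumption).
    """
    height, width = len(image), len(image[0])
    up, left = box_h // 2, box_w // 2
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total, count = 0.0, 0
            for rr in range(r - up, r - up + box_h):
                for cc in range(c - left, c - left + box_w):
                    if 0 <= rr < height and 0 <= cc < width:
                        total += image[rr][cc]
                        count += 1
            out[r][c] = total / count
    return out


def bandpass(image, small=(2, 1), large=(25, 5)):
    """Average with the small rectangle, average that result with the
    large rectangle, then subtract the second image from the first."""
    lowpassed = box_average(image, *small)
    # The text refers to this second result as the highpass filtered image.
    second = box_average(lowpassed, *large)
    return [[lo - se for lo, se in zip(lo_row, se_row)]
            for lo_row, se_row in zip(lowpassed, second)]
```

The original image is never modified; each step builds a new image, matching the description of retaining the source image in memory.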

After performing the bandpass filtering operation on the 
portion of the image containing the mouth, a morphological 
operation is performed to emphasize the center of the mouth. 
The morphological operation involves convolving a rectangular
shape with the bandpass filtered image. The rectangular shape
has dimensions such as a vertical dimension of 1 pixel and a
horizontal dimension of 8 pixels. After the
morphological operation is performed, the morphologically 
processed image is analyzed to determine a threshold. 

An intensity histogram of the morphologically processed
image is constructed so that pixel intensity is illustrated on
the horizontal scale and the number of pixels is illustrated on
the vertical scale. As discussed earlier, a connected component
histogram is developed to illustrate a threshold value on the
horizontal axis and the number of connected components on the
vertical axis. After making an analysis of the histogram showing
connected components, a threshold is selected. The threshold is
determined using the previously described training procedure.
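
A connected component histogram of the kind described can be sketched as below. The function names are hypothetical, and treating diagonal neighbors as connected (8-connectivity) is an assumption, since the text only says that a component may span more than one row. The sketch builds the histogram only; how the final threshold is chosen from it is left to the training procedure.

```python
def threshold_image(image, threshold):
    """Set a pixel to 1 (white) if its intensity is at or above the
    threshold, otherwise to 0 (black)."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]


def count_components(binary):
    """Count groups of connected white pixels using a flood fill.
    Diagonal neighbors count as connected (an assumption)."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


def component_histogram(image, max_value=255):
    """Map each threshold 1..max_value to the number of connected
    components in the corresponding binary image."""
    return {t: count_components(threshold_image(image, t))
            for t in range(1, max_value + 1)}
```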

Once the threshold is selected, the morphologically pro- 
cessed image is thresholded as discussed earlier to produce 
a binary image. The binary image results in a group of 
connected components that identify the mouth. These con- 
nected components identify the center of the mouth, and the 
left and right ends of the connected components identify the 
left and right edges of the mouth. 
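
Reading the mouth edges off the binary image can be sketched as follows. This assumes the binary image covers only the mouth region, so that every white pixel belongs to the mouth components; the name `mouth_extent` is hypothetical.

```python
def mouth_extent(binary):
    """Return (left, right, middle) column indices spanned by the white
    pixels, or None if the binary image has no white pixels."""
    columns = [c for row in binary for c, v in enumerate(row) if v == 1]
    if not columns:
        return None
    left, right = min(columns), max(columns)
    return left, right, (left + right) // 2
```

The `middle` column is the horizontal position at which the vertical cross section described next would be taken.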

After using the connected components to identify the 
center of the mouth, that portion of the original image or the 
bandpass filtered image is processed. The image is processed 
by examining the vertical cross section through the middle 
of the mouth as identified by the connected components 
from the binary image. The vertical cross section is taken at 
the horizontal middle or position midway between the right 
and left edges of the mouth as identified by the left and right 
edges of the connected components. The cross section is 
taken using a strip that is on the order of 5 pixels wide. 
Variations in contrast or the variation in pixel intensity are 
examined when moving in a vertical direction within this 
strip. (If the strip is 5 pixels wide, the average intensity of
the 5 pixels is used in the analysis.) If a large variation in
intensity occurs, a simple segmentation process is used to
determine the inner and outer boundaries of both the upper
and lower lip. If the variation in contrast is relatively small,
the vertical strip is compared with a group of stored profiles
of vertical cross sections. Variations in intensity are considered
large if the intensities vary over a range greater than a
threshold $T_1$, which is determined using the previously
described training procedure. Large variations occur typically
in images where there is lipstick on the lips being
located. If the variation in intensity is less than threshold $T_1$,
then the profile matching method is used.
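
The strip averaging and the choice between the two methods can be sketched as below. The function names are hypothetical, the variation is taken here as the difference between the largest and smallest value in the profile, and a variation exactly equal to the threshold is sent to profile matching; both of those choices are assumptions.

```python
def strip_profile(image, middle_col, strip_width=5):
    """Average each row across a vertical strip centered on middle_col,
    giving one intensity per row (index 0 is the top row)."""
    half = strip_width // 2
    width = len(image[0])
    c0 = max(0, middle_col - half)
    c1 = min(width, middle_col + half + 1)
    return [sum(row[c0:c1]) / (c1 - c0) for row in image]


def choose_method(profile, t1):
    """Pick segmentation when the intensity range exceeds t1, otherwise
    profile matching. t1 would come from training."""
    variation = max(profile) - min(profile)
    return "segmentation" if variation > t1 else "profile_matching"
```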

If the variation in intensity along the vertical cross section
is larger than threshold $T_1$, segmentation is used. This
method simply involves comparing the intensity of the
pixels with a threshold $T_2$ as the examination moves in a
vertical direction along the vertical cross section. Threshold
$T_2$ is determined using the previously described training
procedure. For example, if the scan is moving in a vertical
direction from bottom to top, the intensity of the pixels is
monitored, and when the intensity crosses threshold $T_2$, the
outer edge of the lower lip is identified. The scan then
continues in an upward direction until the intensity once
again crosses the threshold to identify the inner edge of the
lower lip. The scan then continues vertically until the
intensity crosses the threshold to indicate the inner edge of
the upper lip. The scan is then completed by continuing
vertically until the intensity crosses the threshold $T_2$ to
indicate the outer edge of the upper lip.
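
The bottom-to-top scan can be sketched as follows, assuming the profile is a list of intensities with index 0 at the top of the strip. The name `find_lip_edges` and the dictionary keys are hypothetical, and recording the first row on the far side of each crossing is an assumption about where exactly an edge is placed.

```python
def find_lip_edges(profile, t2):
    """Scan the profile from the bottom row upward and record the rows at
    which the intensity crosses t2. The first four crossings are taken as
    the lower lip's outer and inner edges, then the upper lip's inner and
    outer edges. Returns None if fewer than four crossings are found."""
    crossings = []
    below = profile[-1] < t2
    for row in range(len(profile) - 2, -1, -1):
        now_below = profile[row] < t2
        if now_below != below:
            crossings.append(row)
            below = now_below
            if len(crossings) == 4:
                break
    if len(crossings) < 4:
        return None
    names = ("lower_outer", "lower_inner", "upper_inner", "upper_outer")
    return dict(zip(names, crossings))
```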

The profile matching method involves comparing the
intensity profile from the image with a collection of stored
profiles or cross sections in order to find the closest match.
Once the closest match is found, the inner and outer edges
of both the upper and lower lips specified by that stored
profile are used to identify those locations of the image
under examination.
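
One way to picture the closest-match search is the sketch below. It assumes each stored profile has already been brought to the same length as the measured one and is paired with its known edge positions. The name `closest_profile` is hypothetical, and the sum of squared differences is a stand-in similarity measure chosen for illustration; the text itself speaks of comparing characteristics such as maximum and minimum points.

```python
def closest_profile(profile, stored):
    """Return the edge positions attached to the stored profile that is
    nearest to `profile`.

    `stored` is a list of (stored_profile, edges) pairs. Nearness is the
    sum of squared differences, which is an assumption.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, best_edges = min(stored, key=lambda item: distance(profile, item[0]))
    return best_edges
```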

FIGS. 10 and 11 illustrate stored profiles that are used for
comparison with the intensity profile under examination.
The profiles indicate pixel intensity versus vertical position.
Initially, before a comparison is carried out, the profile
measured from the image under examination is scaled to a
normalized scale that was used with the stored profiles.
Normalization is easily carried out because the previously
determined distance between the eyes is used to determine
the scaling factor. The resulting intensity profile is similar to
the profiles of FIGS. 10 and 11. The resulting profile is
compared with stored profiles to find a best match. Characteristics
such as maximum and minimum points on the
intensity profiles are compared in order to find the best
match. The profiles that are stored and used for comparison
are obtained by taking known images and processing them
in the same manner as the image under examination. Training
images as discussed earlier should be appropriate for the
problem being addressed, that is, determining the position of
lips in an image. In order to provide a complete training set,
the images should have lighting and positioning similar to
those that are expected in images to be examined and the
images should contain a variety of lip shapes, including
images where the lips have lipstick on them. The images
should also include images with mouths partially open, with
teeth showing and with a tongue showing. In a case where
a near match results in an error, that is, the position of the
lips in the image being incorrectly indicated, it is advisable
to add the image, with the correct lip positions, to the stored
set of profiles for future comparison.
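
The scaling step can be sketched as below. It assumes the stored profiles were recorded at a known reference eye distance and that the measured profile is resampled by linear interpolation; the name `rescale_profile`, the `reference_eye_distance` parameter and the interpolation scheme are all assumptions made for illustration.

```python
def rescale_profile(profile, eye_distance, reference_eye_distance):
    """Stretch or shrink a vertical intensity profile so that it matches
    the scale of profiles stored at reference_eye_distance.

    The scaling factor is the ratio of the two eye distances, and the
    profile is resampled by linear interpolation (an assumption).
    """
    scale = reference_eye_distance / eye_distance
    new_len = max(2, round(len(profile) * scale))
    out = []
    for i in range(new_len):
        # Position of output sample i measured in input sample units.
        pos = i * (len(profile) - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(profile) - 1)
        frac = pos - lo
        out.append(profile[lo] * (1 - frac) + profile[hi] * frac)
    return out
```

For instance, a face whose eyes are half as far apart as in the reference images would have its profile stretched to twice its length before comparison.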

By using this method, the mouth and the lips are defined 
with 6 points. FIG. 12 illustrates a mouth where points 100 
and 102 define the left and right edges of the mouth. Points 104 and 106 define the outer and inner edges of the upper lip, respectively, and points 108 and 110 define the inner and outer edges of the lower lip, respectively. Specifying these 6 points provides an accurate positioning for use in low bandwidth video communication such as those employing the morphing method.

The invention claimed is:

1. A method for locating a subject's lips in an image, comprising the steps of:

bandpass filtering the image to produce a bandpass filtered image;

morphologically processing the bandpass filtered image to produce an enhanced image by convolving the bandpass filtered image with a rectangular shape;

thresholding the enhanced image to form a binary image having a plurality of connected components and using a classifier to identify at least one connected component corresponding to the subject's mouth;

determining an intensity profile along a vertical strip across the image of the subject's mouth;

determining an intensity variation of the intensity profile; and

comparing the intensity profile to a plurality of stored intensity profiles to locate an edge of at least one of the lips if the intensity variation is below a first threshold, and comparing the intensity profile to a second threshold to locate the edge of at least one of the lips if the intensity variation is above the first threshold.

2. The method of claim 1, wherein the step of morphologically processing comprises convolving the bandpass filtered image with a rectangular shape having vertical and horizontal dimensions [...] to the subject's mouth.

3. The method of claim 1, wherein the step of bandpass filtering comprises lowpass filtering the image to produce a lowpass filtered image, highpass filtering the lowpass filtered image to produce a highpass filtered image, and subtracting the highpass filtered image from the lowpass filtered image to produce the bandpass filtered image.

4. The method of claim 1, wherein the step of using a classifier comprises using a linear classifier.

5. The method of claim 1, wherein the step of using a classifier comprises using a neural network.

6. The method of claim 1, wherein the step of bandpass filtering comprises convolving the image with a rectangular shape.

* * * * *